Sampled Weighted Min-Hashing for Large-Scale Topic Mining
We present Sampled Weighted Min-Hashing (SWMH), a randomized approach to
automatically mine topics from large-scale corpora. SWMH generates multiple
random partitions of the corpus vocabulary based on term co-occurrence and
agglomerates highly overlapping inter-partition cells to produce the mined
topics. While other approaches define a topic as a probabilistic distribution
over a vocabulary, SWMH topics are ordered subsets of such vocabulary.
Interestingly, the topics mined by SWMH underlie themes from the corpus at
different levels of granularity. We extensively evaluate the meaningfulness of
the mined topics both qualitatively and quantitatively on the NIPS (1.7 K
documents), 20 Newsgroups (20 K), Reuters (800 K) and Wikipedia (4 M) corpora.
Additionally, we compare the quality of SWMH with Online LDA topics for
document representation in classification.
Comment: 10 pages, Proceedings of the Mexican Conference on Pattern
Recognition 201
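The partition step can be illustrated with plain (unweighted) MinHash: terms whose document sets collide under a random-permutation signature land in the same vocabulary cell, and identical co-occurrence patterns collide with certainty. This is a simplified sketch of the idea, not the authors' weighted algorithm; the function name, parameters, and toy corpus are all illustrative.

```python
import random
from collections import defaultdict

def minhash_partition(term_docs, num_hashes=3, seed=0):
    """Group vocabulary terms whose document sets collide under a
    MinHash signature; colliding terms form one partition cell.
    Plain (unweighted) MinHash is used for clarity; SWMH proper
    uses a weighted variant over term co-occurrence."""
    rng = random.Random(seed)
    # One random permutation of document ids per hash function.
    all_docs = sorted({d for docs in term_docs.values() for d in docs})
    perms = []
    for _ in range(num_hashes):
        p = all_docs[:]
        rng.shuffle(p)
        perms.append({d: i for i, d in enumerate(p)})
    cells = defaultdict(list)
    for term, docs in term_docs.items():
        # Signature: smallest permuted index of any document containing the term.
        sig = tuple(min(perm[d] for d in docs) for perm in perms)
        cells[sig].append(term)
    return list(cells.values())

# Toy corpus: terms that co-occur in the same documents collide.
term_docs = {
    "neural":  {0, 1, 2},
    "network": {0, 1, 2},
    "bayes":   {3, 4},
    "prior":   {3, 4},
}
cells = minhash_partition(term_docs)
```

Terms with identical document sets always share a signature, while disjoint sets never can, so the toy cells recover the two co-occurrence groups.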
Powerpropagation: A sparsity inducing weight reparameterisation
The training of sparse neural networks is becoming an increasingly important tool
for reducing the computational footprint of models at training and evaluation, as
well as enabling the effective scaling up of models. Whereas much work over the
years has been dedicated to specialised pruning techniques, little attention has
been paid to the inherent effect of gradient based training on model sparsity. In
this work, we introduce Powerpropagation, a new weight-parameterisation for
neural networks that leads to inherently sparse models. Exploiting the behaviour
of gradient descent, our method gives rise to weight updates exhibiting a “rich get
richer” dynamic, leaving low-magnitude parameters largely unaffected by learning.
Models trained in this manner exhibit similar performance, but have a distribution
with markedly higher density at zero, allowing more parameters to be pruned safely.
Powerpropagation is general, intuitive, cheap and straightforward to implement
and can readily be combined with various other techniques. To highlight its versatility, we explore it in two very different settings: Firstly, following a recent
line of work, we investigate its effect on sparse training for resource-constrained
settings. Here, we combine Powerpropagation with a traditional weight-pruning
technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing
superior performance on the ImageNet benchmark. Secondly, we advocate the use
of sparsity in overcoming catastrophic forgetting, where compressed representations allow accommodating a large number of tasks at fixed model capacity. In all
cases our reparameterisation considerably increases the efficacy of the off-the-shelf
methods.
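The reparameterisation and its "rich get richer" dynamic can be sketched in a few lines: writing each weight as w = v·|v|^(α−1) makes the chain-rule gradient with respect to v scale with |v|^(α−1), so low-magnitude parameters receive proportionally smaller updates. A minimal numpy sketch under that reading of the paper (α and the toy gradients are illustrative):

```python
import numpy as np

def powerprop_forward(v, alpha=2.0):
    """Reparameterise weights as w = v * |v|**(alpha - 1).
    For alpha > 1 the mapping preserves sign but stretches
    large magnitudes."""
    return v * np.abs(v) ** (alpha - 1.0)

def powerprop_grad_v(grad_w, v, alpha=2.0):
    """Chain rule: dL/dv = dL/dw * alpha * |v|**(alpha - 1).
    The per-weight update scales with the weight's own magnitude,
    producing the 'rich get richer' dynamic."""
    return grad_w * alpha * np.abs(v) ** (alpha - 1.0)

v = np.array([1.0, 0.1])
g_w = np.array([1.0, 1.0])      # identical loss gradient on both weights
g_v = powerprop_grad_v(g_w, v)  # the large-|v| entry gets the bigger update
```

With equal upstream gradients, the small-magnitude parameter barely moves, which is why the trained distribution concentrates near zero and prunes safely.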
Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Equivariant Projected Kernels
Gaussian processes are machine learning models capable of learning unknown functions in a way that represents uncertainty, thereby facilitating construction of optimal decision-making systems. Motivated by a desire to deploy Gaussian processes in novel areas of science, a rapidly-growing line of research has focused on constructively extending these models to handle non-Euclidean domains, including Riemannian manifolds, such as spheres and tori. We propose techniques that generalize this class to model vector fields on Riemannian manifolds, which are important in a number of application areas in the physical sciences. To do so, we present a general recipe for constructing gauge equivariant kernels, which induce Gaussian vector fields, i.e. vector-valued Gaussian processes coherent with geometry, from scalar-valued Riemannian kernels. We extend standard Gaussian process training methods, such as variational inference, to this setting. This enables vector-valued Gaussian processes on Riemannian manifolds to be trained using standard methods and makes them accessible to machine learning practitioners.
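One way to see the projection idea is on the two-sphere embedded in R^3: projecting a scalar ambient kernel onto tangent planes, K(x, y) = P_x k(x, y) P_y, yields a matrix-valued kernel whose induced Gaussian vector fields are tangent to the manifold everywhere. This is a toy of the extrinsic projection construction only, not the paper's full gauge-equivariant recipe; the squared-exponential kernel and lengthscale are illustrative choices.

```python
import numpy as np

def tangent_projector(x):
    """Orthogonal projector onto the tangent plane of the unit
    sphere at x (x is assumed unit-norm)."""
    return np.eye(3) - np.outer(x, x)

def projected_kernel(x, y, lengthscale=1.0):
    """Matrix-valued kernel K(x, y) = P_x k(x, y) P_y built from a
    scalar squared-exponential kernel k on the ambient space.
    Samples from the induced GP are vector fields tangent to the
    sphere at every point."""
    k = np.exp(-np.sum((x - y) ** 2) / (2 * lengthscale ** 2))
    return tangent_projector(x) @ (k * tangent_projector(y))

x = np.array([0.0, 0.0, 1.0])  # north pole
y = np.array([1.0, 0.0, 0.0])  # point on the equator
K = projected_kernel(x, y)
```

Tangency is immediate from the construction: x^T K = 0 and K y = 0 because each projector annihilates its own base point.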
Stacked Capsule Autoencoders
Objects are composed of a set of geometrically organized parts. We introduce
an unsupervised capsule autoencoder (SCAE), which explicitly uses geometric
relationships between parts to reason about objects. Since these relationships
do not depend on the viewpoint, our model is robust to viewpoint changes. SCAE
consists of two stages. In the first stage, the model predicts presences and
poses of part templates directly from the image and tries to reconstruct the
image by appropriately arranging the templates. In the second stage, SCAE
predicts parameters of a few object capsules, which are then used to
reconstruct part poses. Inference in this model is amortized and performed by
off-the-shelf neural encoders, unlike in previous capsule networks. We find
that object capsule presences are highly informative of the object class, which
leads to state-of-the-art results for unsupervised classification on SVHN (55%)
and MNIST (98.7%). The code is available at
https://github.com/google-research/google-research/tree/master/stacked_capsule_autoencoders
Comment: NeurIPS 2019; 14 pages, 7 figures, 4 tables
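The viewpoint-invariance argument can be made concrete with a small geometric toy: if each part's pose in the object's frame is a fixed relation, then part poses in the image are the object pose composed with those relations, and a viewpoint change acts on all parts through the object pose alone, leaving the relations untouched. This is only an illustration of the geometry, not the SCAE network; the 2-D affine poses and relations below are invented for the example.

```python
import numpy as np

def affine2d(theta, tx, ty):
    """Homogeneous 2-D pose: rotation by theta plus translation (tx, ty)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0,  0, 1.0]])

# Viewpoint-invariant object-part relations (fixed poses in the
# object's own frame; values are arbitrary for illustration).
relations = [affine2d(0.0, 1.0, 0.0), affine2d(np.pi / 2, 0.0, 1.0)]

def predict_part_poses(object_pose, relations):
    """Part pose = object pose composed with the fixed relation,
    mirroring how an object capsule reconstructs its parts' poses."""
    return [object_pose @ rel for rel in relations]

obj = affine2d(0.3, 2.0, -1.0)
parts = predict_part_poses(obj, relations)
```

Applying a viewpoint change V to the object pose moves every predicted part pose by exactly V, which is the robustness to viewpoint changes the abstract describes.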
Effectiveness and resource requirements of test, trace and isolate strategies for COVID in the UK
We use an individual-level transmission and contact simulation
model to explore the effectiveness and resource requirements of
various test-trace-isolate (TTI) strategies for reducing the spread
of SARS-CoV-2 in the UK, in the context of different scenarios
with varying levels of stringency of non-pharmaceutical
interventions. Based on modelling results, we show that self-isolation
of symptomatic individuals and quarantine of their
household contacts have a substantial impact on the number of
new infections generated by each primary case. We further
show that adding contact tracing of non-household contacts of
confirmed cases to this broader package of interventions
reduces the number of new infections otherwise generated by
5–15%. We also explore the impact of key factors, such as tracing
application adoption and testing delay, on the overall effectiveness
of TTI.
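The arithmetic behind such estimates can be sketched with a deterministic expectation model: self-isolation averts a fraction of a case's onward transmissions, household quarantine suppresses the household share, and tracing removes a further fraction of the remaining non-household transmissions. All parameter values below are illustrative assumptions, not the paper's estimates, and the toy deliberately ignores timing, compliance, and contact-network structure.

```python
def expected_new_infections(r0=2.6, isolation_effect=0.65,
                            household_frac=0.3, quarantine_effect=0.9,
                            trace_prob=0.5, trace_effect=0.5):
    """Expected onward infections per primary case under a toy TTI
    model (all parameters are illustrative assumptions).
    Returns the expectation without and with non-household tracing."""
    base = r0 * (1.0 - isolation_effect)          # case self-isolates
    household = base * household_frac * (1.0 - quarantine_effect)
    other = base * (1.0 - household_frac)         # non-household contacts
    no_tracing = household + other
    with_tracing = household + other * (1.0 - trace_prob * trace_effect)
    return no_tracing, with_tracing

no_t, with_t = expected_new_infections()
```

Varying `trace_prob` (app adoption) or adding a delay discount to `trace_effect` reproduces the qualitative sensitivity analysis the abstract mentions.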
The Mondrian Kernel
We introduce the Mondrian kernel, a fast approximation to the Laplace kernel. It is suitable for both batch and online learning, and admits a fast kernel-width-selection procedure as the random features can be re-used efficiently for all kernel widths. The features are constructed by sampling trees via a Mondrian process [Roy and Teh, 2009], and we highlight the connection to Mondrian forests [Lakshminarayanan et al., 2014], where trees are also sampled via a Mondrian process, but fit independently. This link provides a new insight into the relationship between kernel methods and random forests.
Funding: Gatsby Charitable Foundation, Alan Turing Institute, Google, Microsoft Research, Engineering and Physical Sciences Research Council (Grant ID: EP/N014162/1), NSERC (Discovery Grant), European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) (Grant agreement no. 617071).
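The Laplace-kernel connection is easiest to verify in one dimension, where a Mondrian process with a given lifetime cuts the interval at the points of a Poisson process, so two inputs share a partition cell with probability exp(−lifetime·|x − y|). The Monte Carlo sketch below is our own minimal construction of that 1-D special case, not the authors' code:

```python
import numpy as np

def mondrian_kernel_estimate(x, y, lifetime=2.0, n_trees=5000, seed=0):
    """In 1-D, a Mondrian process with the given lifetime cuts [0, 1]
    at the points of a rate-`lifetime` Poisson process, so two inputs
    land in the same cell with probability exp(-lifetime * |x - y|):
    the Laplace kernel.  Averaging same-cell indicators over many
    independent trees approximates that kernel."""
    rng = np.random.default_rng(seed)
    same_cell = 0
    for _ in range(n_trees):
        cuts = rng.uniform(0.0, 1.0, size=rng.poisson(lifetime))
        # Same cell iff the same number of cuts falls below each input.
        if np.sum(cuts < x) == np.sum(cuts < y):
            same_cell += 1
    return same_cell / n_trees

est = mondrian_kernel_estimate(0.2, 0.5)  # target: exp(-2 * 0.3)
```

Because the same sampled trees serve every pair of inputs, the features can indeed be shared across evaluations, which is the reuse property the abstract highlights.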
How Robust are the Estimated Effects of Nonpharmaceutical Interventions against COVID-19?
To what extent are effectiveness estimates of nonpharmaceutical interventions (NPIs) against COVID-19 influenced by the assumptions our models make? To answer this question, we investigate 2 state-of-the-art NPI effectiveness models and propose 6 variants that make different structural assumptions. In particular, we investigate how well NPI effectiveness estimates generalise to unseen countries, and their sensitivity to unobserved factors. Models which account for noise in disease transmission compare favourably. We further evaluate how robust estimates are to different choices of epidemiological parameters and data. Focusing on models that assume transmission noise, we find that previously published results are robust across these choices and across different models. Finally, we mathematically ground the interpretation of NPI effectiveness estimates when certain common assumptions do not hold.
Location Dependent Dirichlet Processes
Dirichlet processes (DP) are widely applied in Bayesian nonparametric
modeling. However, in their basic form they do not directly integrate
dependency information among data arising from space and time. In this paper,
we propose location dependent Dirichlet processes (LDDP) which incorporate
nonparametric Gaussian processes in the DP modeling framework to model such
dependencies. We develop the LDDP in the context of mixture modeling, and
develop a mean field variational inference algorithm for this mixture model.
The effectiveness of the proposed modeling framework is shown on an image
segmentation task.
Globally Continuous and Non-Markovian Crowd Activity Analysis from Videos
Automatically recognizing activities in video is a classic problem in vision and helps to understand behaviors, describe scenes and detect anomalies. We propose an unsupervised method for such purposes. Given video data, we discover recurring activity patterns that appear, peak, wane and disappear over time. By using non-parametric Bayesian methods, we learn coupled spatial and temporal patterns with minimum prior knowledge. To model the temporal changes of patterns, previous works compute Markovian progressions or locally continuous motifs, whereas we model time in a globally continuous and non-Markovian way. Visually, the patterns depict flows of major activities. Temporally, each pattern has its own unique appearance-disappearance cycles. To compute compact pattern representations, we also propose a hybrid sampling method. By combining these patterns with detailed environment information, we interpret the semantics of activities and report anomalies. Also, our method fits data better and detects anomalies that were difficult to detect previously.